massive language model
SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot
We show for the first time that large-scale generative pretrained transformer (GPT) family models can be pruned to at least 50% sparsity in one-shot, without any retraining, at minimal loss of accuracy. This is achieved via a new pruning method called SparseGPT, specifically designed to work efficiently and accurately on massive GPT-family models. We can execute SparseGPT on the largest available open-source models, OPT-175B and BLOOM-176B, in under 4.5 hours, and can reach 60% unstructured sparsity with negligible increase in perplexity: remarkably, more than 100 billion weights from these models can be ignored at inference time. SparseGPT generalizes to semi-structured (2:4 and 4:8) patterns, and is compatible with weight quantization approaches. The code is available at: https://github.com/IST-DASLab/sparsegpt.
- Europe > Austria (0.04)
- North America > United States > New Jersey (0.04)
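The SparseGPT abstract above refers to unstructured sparsity and to semi-structured (2:4 and 4:8) patterns. The sketch below is not the SparseGPT algorithm, which uses an approximate second-order weight reconstruction; it is only a minimal, assumed illustration of what a 2:4 mask means: in every contiguous group of four weights, the two smallest-magnitude entries are zeroed, giving exactly 50% sparsity in a hardware-friendly layout. The function name and the magnitude-based selection are illustrative choices, not from the paper.

```python
import torch

def prune_2_4_magnitude(weight: torch.Tensor) -> torch.Tensor:
    """Zero the 2 smallest-magnitude weights in every group of 4 along each row."""
    out_features, in_features = weight.shape
    assert in_features % 4 == 0, "input dimension must be divisible by the group size 4"
    groups = weight.reshape(out_features, in_features // 4, 4)
    # Indices of the 2 largest-magnitude weights in each group of 4.
    keep_idx = groups.abs().topk(k=2, dim=-1).indices
    # Build a 0/1 mask that keeps only those entries.
    mask = torch.zeros_like(groups)
    mask.scatter_(-1, keep_idx, 1.0)
    return (groups * mask).reshape(out_features, in_features)

w = torch.randn(8, 16)
w_sparse = prune_2_4_magnitude(w)
print("sparsity:", (w_sparse == 0).float().mean().item())  # prints 0.5
```

The appeal of an N:M pattern like 2:4, as opposed to fully unstructured sparsity, is that the zeros fall in a regular layout that sparse matrix hardware can exploit to skip the pruned weights at inference time.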
Nvidia makes massive language model available to enterprises
At its fall 2021 GPU Technology Conference (GTC) today, Nvidia announced that it's making Megatron 530B, one of the world's largest language models, available to enterprises for training to serve new domains and languages. First detailed in early October, Megatron 530B -- also known as Megatron-Turing Natural Language Generation (MT-NLG) -- contains 530 billion parameters and achieves high accuracy in a broad set of natural language tasks, including reading comprehension, commonsense reasoning, and natural language inference. "Today, we provide recipes for customers to build, train, and customize large language models, including Megatron 530B. This includes scripts, code, and 530B untrained model. Customers can start from smaller models and scale up to larger models as they see fit," Nvidia VP of AI software product management Kari Briski told VentureBeat via email.
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.05)
- North America > United States > Massachusetts > Hampshire County > Amherst (0.05)
- Asia > China > Beijing > Beijing (0.05)
AI21 Labs has trained a massive language model to fiercely rival OpenAI's GPT-3
AI21 Labs: For the better part of a year, OpenAI's GPT-3 has remained among the largest artificial intelligence language models ever created. Through an API, people have used it to automatically write articles and emails, summarize text, compose poetry and recipes, generate deep learning code in Python, and create layouts and templates for websites. Now an AI lab based in Tel Aviv, Israel, named AI21 Labs, says it plans to release a larger model and make it available via a service, with the aim of challenging OpenAI's dominance in the natural-language-processing-as-a-service field. The startup says the largest version of its model, known as Jurassic-1 Jumbo, contains 178 billion parameters, 3 billion more than GPT-3. In artificial intelligence and machine learning, parameters are the parts of a model that are learned from historical training data.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.85)
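The article above describes parameters as the parts of a model learned from historical training data, and compares Jurassic-1 Jumbo's 178 billion parameters with GPT-3's 175 billion. As a rough illustration only, the sketch below counts the learned weights and biases of a tiny made-up PyTorch model; the layer sizes are arbitrary assumptions and bear no relation to either model.

```python
import torch.nn as nn

# Layer sizes are arbitrary and unrelated to GPT-3 or Jurassic-1.
model = nn.Sequential(
    nn.Embedding(num_embeddings=50_000, embedding_dim=512),  # 50,000 * 512 weights
    nn.Linear(512, 2048),   # 512 * 2048 weights + 2048 biases
    nn.ReLU(),              # no learned parameters
    nn.Linear(2048, 512),   # 2048 * 512 weights + 512 biases
)

total = sum(p.numel() for p in model.parameters())
print(f"total learned parameters: {total:,}")  # 27,699,712
```

A "175-billion-parameter" or "178-billion-parameter" model is simply this same count taken over a vastly larger stack of layers.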
AI21 Labs trains a massive language model to rival OpenAI's GPT-3
For the better part of a year, OpenAI's GPT-3 has remained among the largest AI language models ever created, if not the largest of its kind. Via an API, people have used it to automatically write emails and articles, summarize text, compose poetry and recipes, create website layouts, and generate code for deep learning in Python. But an AI lab based in Tel Aviv, Israel -- AI21 Labs -- says it's planning to release a larger model and make it available via a service, with the idea being to challenge OpenAI's dominance in the "natural language processing-as-a-service" field. The startup says that the largest version of its model -- called Jurassic-1 Jumbo -- contains 178 billion parameters, or 3 billion more than GPT-3 (but not more than PanGu-Alpha, HyperCLOVA, or Wu Dao 2.0).
- Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.24)
- North America > United States > New York (0.15)
- North America > United States > Massachusetts (0.04)
- (2 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.82)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.85)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.85)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.40)